15 research outputs found

    Learning control policies of driverless vehicles from UAV video streams in complex urban environments

    Get PDF
    © 2019 by the authors. The way we drive and the transport systems of today are going through radical changes. Intelligent mobility envisions improving the efficiency of traditional transportation through advanced digital technologies, such as robotics, artificial intelligence and the Internet of Things. Central to the development of intelligent mobility technology is the emergence of connected autonomous vehicles (CAVs), where vehicles are capable of navigating environments autonomously. For this to be achieved, autonomous vehicles must be safe and trusted by passengers and other drivers. However, it is practically impossible to train autonomous vehicles on all the possible traffic conditions that they may encounter. The work in this paper presents an alternative solution of using infrastructure to aid CAVs in learning driving policies, specifically for complex junctions, which require local experience and knowledge to handle. The proposal is to learn safe driving policies through data-driven imitation learning of human-driven vehicles at a junction, utilizing data captured from surveillance devices about vehicle movements at the junction. The proposed framework is demonstrated by processing video datasets, captured by uncrewed aerial vehicles (UAVs) at three intersections around Europe, which contain vehicle trajectories. An imitation learning algorithm based on a long short-term memory (LSTM) neural network is proposed to learn and predict safe vehicle trajectories. The proposed framework can serve many purposes in intelligent mobility, such as augmenting the intelligent control algorithms in driverless vehicles, benchmarking driver behavior for insurance purposes, and providing insights for city planning.
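
    The imitation-learning setup above can be sketched at the data-preparation level: sliding a fixed window over an observed vehicle trajectory yields (history, next-position) training pairs for a sequence model such as an LSTM. The function name and window size below are illustrative assumptions, not taken from the paper.

```python
def make_trajectory_pairs(trajectory, window=3):
    """Slice a trajectory [(x, y), ...] into (history, next_point) pairs:
    the model sees `window` past positions and predicts the next one."""
    pairs = []
    for i in range(len(trajectory) - window):
        history = trajectory[i:i + window]   # last `window` observed positions
        target = trajectory[i + window]      # position to predict
        pairs.append((history, target))
    return pairs
```

    Each history/target pair then becomes one supervised example, so imitation learning reduces to sequence regression on recorded human-driven trajectories.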

    Analysis by synthesis spatial audio coding

    Get PDF
    This study presents a novel spatial audio coding (SAC) technique, called analysis-by-synthesis SAC (AbS-SAC), with the capability of minimising the signal distortion introduced during the encoding process. The reverse one-to-two (R-OTT) module, applied in MPEG Surround to down-mix two channels into a single channel, is first configured as a closed-loop system. This closed-loop module offers the capability to reduce the quantisation errors of the spatial parameters, leading to improved quality of the synthesised audio signals. Moreover, a sub-optimal AbS optimisation, based on the closed-loop R-OTT module, is proposed. This algorithm addresses the impracticality of implementing an optimal AbS optimisation while still further improving the quality of the reconstructed audio signals. In terms of algorithmic complexity, the proposed sub-optimal algorithm provides scalability. The results of objective and subjective tests are presented. It is shown that significant improvement in objective performance is achieved when compared to the conventional open-loop approach. Moreover, subjective tests show that the proposed technique achieves higher subjective difference grade scores than the tested Advanced Audio Coding (AAC) multichannel codec.
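
    The closed-loop idea can be illustrated with a toy analysis-by-synthesis parameter search: rather than quantising the channel-level difference (CLD) open-loop, each quantiser candidate is used to resynthesise the stereo pair from the mono down-mix, and the candidate with the smallest resynthesis error is kept. The up-mix gain formulas below are a simplification for illustration, not the MPEG Surround equations.

```python
import math

def abs_cld(left, right, candidates):
    """Pick the quantised CLD (in dB) that minimises resynthesis error --
    the analysis-by-synthesis idea applied to one spatial parameter."""
    mid = [(a + b) / 2 for a, b in zip(left, right)]  # mono down-mix

    def synth(cld_db):
        g = 10 ** (cld_db / 20)          # linear level ratio left/right
        gl = 2 * g / (1 + g)             # toy up-mix gains (illustrative)
        gr = 2 / (1 + g)
        return [gl * m for m in mid], [gr * m for m in mid]

    def err(cld_db):
        sl, sr = synth(cld_db)
        return (sum((a - b) ** 2 for a, b in zip(sl, left)) +
                sum((a - b) ** 2 for a, b in zip(sr, right)))

    return min(candidates, key=err)      # closed-loop: smallest error wins
```

    An open-loop coder would instead quantise the measured CLD directly; the closed loop trades extra synthesis passes for lower reconstruction error.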

    IoT driven ambient intelligence architecture for indoor intelligent mobility

    Get PDF
    Personal robots are set to assist humans in their daily tasks. Assisted living is one of the major applications of personal assistive robots, where robots support the health and wellbeing of humans in need, especially the elderly and disabled. Indoor environments are extremely challenging from a robot perception and navigation point of view because of ever-changing decorations, internal organization and clutter. Furthermore, human-robot interaction in personal assistive robots demands intuitive, human-like intelligence and interactions. The above challenges are aggravated by stringent and often tacit requirements surrounding personal privacy, which may be invaded by continuous monitoring through sensors. Towards addressing these problems, this paper presents an architecture for "Ambient Intelligence" for indoor intelligent mobility by leveraging the Internet of Things (IoT) within a Scalable Multi-layered Context Mapping Framework. Our objective is to utilize sensors in home settings in the least invasive manner for the robot to learn about its dynamic surroundings and interact in a human-like manner. The paper takes a semi-survey approach to presenting and illustrating preliminary results from our in-house-built, fully autonomous electric quadbike.
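
    The multi-layered context mapping idea can be sketched as a map whose layers hold observations of different volatility, so transient clutter can be updated without disturbing the static floor plan. The class, layer names, and keying scheme below are illustrative assumptions, not the paper's actual framework.

```python
class ContextMap:
    """Toy multi-layered context map: each layer (e.g. 'static', 'clutter')
    keeps its own observations keyed by location."""

    def __init__(self, layers):
        self.layers = {name: {} for name in layers}

    def observe(self, layer, location, label):
        """Record what a sensor saw at a location, in one layer only."""
        self.layers[layer][location] = label

    def query(self, location):
        """Return what every layer currently believes about a location."""
        return {name: obs.get(location) for name, obs in self.layers.items()}
```

    Separating layers this way also helps with privacy: invasive sensor streams can feed only short-lived layers while the persistent map stores nothing personal.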

    Information system security reinforcement with WGAN-GP for detection of zero-day attacks

    No full text
    Growing sophistication among cyber threats has posed increasing challenges to the security and reliability of information systems, especially in the face of zero-day attacks that exploit unknown vulnerabilities. This paper introduces an innovative application of Artificial Intelligence (AI), specifically the adoption of Wasserstein Generative Adversarial Networks with Gradient Penalty (WGAN-GP), to support Intrusion Detection Systems (IDS) in strengthening defences against such attacks. This research focuses on using WGAN-GP to generate network traffic data that simulates the unpredictable patterns of zero-day attacks. It utilises the widely used network traffic dataset NSL-KDD for data expansion. This approach leverages the data generated by WGAN-GP to train detection systems, enabling them to learn and identify the subtle signatures of zero-day attacks. Experimental evaluation demonstrates that the WGAN-GP model can improve the accuracy of zero-day attack detection. Compared with other methods, such as Convolutional Neural Networks (CNN), the detection accuracy is increased by 2.3% and 2% for binary and multi-class classification, respectively. This work shows that combining IDS with advanced generative AI models, such as WGAN-GP, can significantly enhance the security of information systems in identifying and mitigating the risks posed by zero-day attacks.
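
    The term distinguishing WGAN-GP from a plain WGAN is the gradient penalty, lambda * E[(||grad D(x_hat)|| - 1)^2], computed at points interpolated between real and generated samples to push the critic towards being 1-Lipschitz. As a sketch, with the gradient norms passed in directly for illustration:

```python
def gradient_penalty(grad_norms, lam=10.0):
    """WGAN-GP penalty: lam * mean((||grad|| - 1)^2) over interpolated
    samples; zero exactly when every gradient norm is 1."""
    return lam * sum((g - 1.0) ** 2 for g in grad_norms) / len(grad_norms)
```

    In a full training loop this term is added to the critic loss E[D(fake)] - E[D(real)], with lam = 10 being the value commonly used in the WGAN-GP literature.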

    Adaptive blind moving source separation based on intensity vector statistics

    No full text
    This paper presents a novel approach to blind moving source separation by detecting, tracking and separating speakers in real time using intensity vector direction (IVD) statistics. It updates the unmixing system parameters swiftly to deal with time-variant mixing parameters. Denoising is carried out to extract reliable speaker estimates using von Mises modeling of the IVD measurements in space and IIR filtering of the IVD distribution in time. Peaks in the IVD distribution are assigned location expectation values to check for consistency, and peaks with high location expectation are consequently declared active speakers. The location expectation algorithm caters for natural pauses during speech delivery. Speaker movements are tracked by spatially isolating the detected peaks using time-variant regions of interest. As a result, the proposed moving source separation system is capable of blindly detecting, tracking and separating moving speakers. A real-time demonstration has been developed with the proposed system pipeline, allowing users to listen to active speakers in any desired combination. The system has the advantage of using a small coincident microphone array to separate any number of moving sources, utilising first-order Ambisonics signals while assuming the source signals to be W-disjoint orthogonal. Being nearly closed-form, the proposed system does not require convergence or initialization of parameters.
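
    Two of the steps above can be sketched directly: first-order IIR smoothing of the IVD direction histogram over time, and picking circular peaks as candidate speaker directions. Representing the IVD distribution as a discrete azimuth histogram, and the smoothing factor alpha, are illustrative simplifications of the paper's von Mises modeling.

```python
def smooth_ivd_histogram(hist_t, state, alpha=0.9):
    """First-order IIR smoothing in time of the IVD direction histogram:
    state <- alpha * state + (1 - alpha) * current frame."""
    return [alpha * s + (1 - alpha) * h for s, h in zip(state, hist_t)]

def circular_peaks(hist, threshold):
    """Indices that exceed a threshold and both circular neighbours --
    candidate speaker directions on the azimuth circle."""
    n = len(hist)
    return [i for i in range(n)
            if hist[i] > threshold
            and hist[i] >= hist[(i - 1) % n]
            and hist[i] >= hist[(i + 1) % n]]
```

    The location-expectation and region-of-interest tracking stages would then operate on these smoothed peaks frame by frame.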

    A model selection algorithm for complex CNN systems based on feature-weights relation

    No full text
    In object recognition using machine learning, one model cannot practically be trained to identify all the possible objects it encounters. An ensemble of models may be needed to cater to a broader range of objects. Building a mathematical understanding of the relationship between objects that share comparable outline features is envisaged as an effective way to improve the model ensemble through a pre-processing stage, where these objects' features are grouped under a broader classification umbrella. This paper proposes a mechanism for training an ensemble of recognition models, coupled with a model selection scheme, to scale up object recognition in a multi-model system. The individual models are built with a CNN structure, whereas the image features are extracted using a CNN/VGG16 architecture. Based on the models' excitation weights, a neural-network model selection algorithm, which measures how close the object's features are to each trained model and selects a model accordingly, is tested on a multi-model neural network platform. The experimental results show that the proposed model selection scheme is highly effective and accurate in selecting an appropriate model from a network of multiple models.
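
    The selection step can be sketched as a nearest-model search: compare the extracted feature vector against each model's reference excitation-weight vector and route the object to the most similar one. Cosine similarity is used here for illustration; the paper's actual selector is a learned neural network.

```python
import math

def select_model(features, model_weights):
    """Pick the ensemble member whose reference weight vector is most
    similar (cosine similarity) to the extracted feature vector."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)
    return max(model_weights, key=lambda name: cos(features, model_weights[name]))
```

    Only the chosen model then runs full recognition, which is what lets the ensemble scale to many object categories.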

    Conversational emotion detection and elicitation: a preliminary study

    No full text
    Emotion recognition in conversation is a challenging task, as it requires an understanding of the contextual and linguistic aspects of a conversation. Emotion recognition in speech has been well studied, but in bi-directional or multi-directional conversations, emotions can be complex, mixed, and embedded in context. To tackle this challenge, we propose a method that combines the state-of-the-art RoBERTa model (robustly optimized BERT pretraining approach) with a bidirectional long short-term memory (BiLSTM) network for contextualized emotion recognition. RoBERTa is a transformer-based language model and an advanced version of the well-known BERT. We use RoBERTa features as input to a BiLSTM model that learns to capture contextual dependencies and sequential patterns in the input text. The proposed model is trained and evaluated on the Multimodal EmotionLines Dataset (MELD) to recognize emotions in conversation. The textual modality of the dataset is used for the experimental evaluation, with the weighted average F1 score and accuracy as performance metrics. The experimental results indicate that combining a pre-trained transformer-based language model with a BiLSTM network significantly enhances the recognition of emotions in contextualized conversational settings.
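
    Contextualised inputs of the kind such a model consumes can be sketched by prepending a few preceding utterances to the target utterance before feature extraction. "</s>" is RoBERTa's separator token; the window size k and the exact joining scheme are illustrative assumptions, not the paper's preprocessing.

```python
def context_window(utterances, i, k=2):
    """Build the contextual input for utterance i: up to k preceding
    utterances joined with RoBERTa's separator, then the target."""
    history = utterances[max(0, i - k):i]
    return " </s> ".join(history + [utterances[i]])
```

    The RoBERTa encoder would embed this string, and the BiLSTM would then model the sequence of such contextual representations across the dialogue.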

    Policy generation from latent embeddings for reinforcement learning

    No full text
    The human brain endows us with extraordinary capabilities that enable us to create, imagine, and generate anything we desire. In particular, we have fascinating imaginative skills that allow us to generate fundamental knowledge from abstract concepts. Motivated by these traits, numerous areas of machine learning, notably unsupervised learning and reinforcement learning, have started using such ideas at their core. Nevertheless, these methods do not come without fault. A fundamental issue with reinforcement learning, especially when neural networks are used as function approximators, is the limited optimality achievable compared with learning tabula rasa. Owing to the nature of learning with neural networks, the behaviours achievable for each task are inconsistent, and a unified approach that enables such optimal policies to exist within a parameter space would facilitate both the learning procedure and the behavioural outcomes. Consequently, we are interested in discovering whether reinforcement learning can be facilitated with unsupervised learning methods in a manner that alleviates this downfall. This work provides an analysis of the feasibility of using generative models to extract learnt reinforcement learning policies (i.e., model parameters), with the intention of conditionally sampling the learnt policy-latent space to generate new policies. We demonstrate that, under the proposed architecture, these models are able to recreate policies on simple tasks but fail on more complex ones. We therefore provide a critical analysis of these failures and discuss further improvements that would aid the proliferation of this work.
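
    The core operation, mapping a sampled latent code back to a flat vector of policy parameters, can be sketched with a toy linear decoder. A real generative model would use a learned deep decoder; the function and its arguments below are hypothetical.

```python
def decode_policy(latent, decoder_weights, decoder_bias):
    """Toy linear decoder: params = W @ z + b, mapping a latent code z
    to a flat policy-parameter vector that can be reshaped into network
    weights."""
    return [sum(w * z for w, z in zip(row, latent)) + b
            for row, b in zip(decoder_weights, decoder_bias)]
```

    Conditionally sampling the latent space and decoding would then yield new policy parameters without running reinforcement learning from scratch.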

    Understanding dilated mathematical relationship between image features and the convolutional neural network’s learnt parameters

    No full text
    Deep learning, in general, is built on input data transformation and presentation, model training with parameter tuning, and recognition of new observations using the trained model. However, this comes at a high computational cost due to the extensive input databases and the length of time required for training. Although the model learns its parameters from the transformed input data, no direct research has investigated the mathematical relationship between the transformed information (i.e., features, excitations) and the model's learnt parameters (i.e., weights). This research explores the mathematical relationship between the input excitations and the weights of a trained convolutional neural network. The objective is to investigate three aspects of this assumed feature-weight relationship: (1) the mathematical relationship between the training images' features and the model's learnt parameters, (2) the mathematical relationship between the features of a separate test dataset's images and a trained model's learnt parameters, and (3) the mathematical relationship between the difference of the training and testing images' features and the model's learnt parameters, using a separate test dataset. The paper empirically demonstrates the existence of this mathematical relationship between the test image features and the model's learnt weights through ANOVA analysis.
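
    The ANOVA evidence above rests on comparing between-group and within-group variance; a minimal one-way F statistic, applicable to groups of feature or weight values, can be sketched as follows (the grouping of values is up to the experiment design):

```python
def anova_f(groups):
    """One-way ANOVA F statistic: ratio of between-group to within-group
    mean squares. A large F suggests the group means differ."""
    k = len(groups)                       # number of groups
    n = sum(len(g) for g in groups)       # total observations
    grand = sum(sum(g) for g in groups) / n
    ssb = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ssw = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ssb / (k - 1)) / (ssw / (n - k))
```

    Comparing the resulting F against the F distribution with (k - 1, n - k) degrees of freedom gives the p-value used to accept or reject the feature-weight relationship.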